Mining closed and multi-supports-based sequential pattern in high-dimensional dataset
Authors
Abstract
Previous mining algorithms for high-dimensional datasets, such as biological datasets, produce very large pattern sets that include short and discontinuous sequential patterns. These patterns do not carry useful information. Mining sequential patterns in such sequences needs to consider different forms of patterns, such as contiguous patterns and local patterns that appear more than once within a particular sequence. Mining closed patterns yields not only a more compact result set but also better efficiency. In this paper, a novel algorithm based on bi-directional extension and multi-supports is proposed specifically for mining contiguous closed patterns in high-dimensional datasets. Three kinds of contiguous closed sequential patterns are mined: sequential patterns, local sequential patterns, and total sequential patterns. Thorough performance studies on biological sequences demonstrate that the proposed algorithm reduces memory consumption and generates compact patterns. A detailed analysis of the multi-supports-based results is also provided.
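The abstract does not define the three supports or the closure test, so the following is only a minimal sketch under stated assumptions, not the paper's algorithm. It assumes that the sequence support is the number of sequences containing a contiguous pattern, that the local support is the number of occurrences inside one sequence, and that the total support is the number of occurrences across the whole dataset. It also assumes a pattern is closed when no one-symbol extension on the left or right keeps the same sequence support. The function names `count_occurrences`, `supports`, and `is_closed` are hypothetical.

```python
def count_occurrences(seq, pattern):
    """Count contiguous (possibly overlapping) occurrences of pattern in seq."""
    n, m = len(seq), len(pattern)
    return sum(1 for i in range(n - m + 1) if seq[i:i + m] == pattern)


def supports(db, pattern):
    """Return (sequence_support, total_support, local_supports).

    Assumed definitions (not taken from the paper):
      sequence_support: number of sequences containing the pattern
      total_support:    occurrences summed over all sequences
      local_supports:   occurrences per sequence
    """
    local = [count_occurrences(s, pattern) for s in db]
    return sum(1 for c in local if c > 0), sum(local), local


def is_closed(db, pattern, alphabet):
    """Bi-directional check: try every one-symbol left and right extension.

    The pattern is treated as closed (an assumption) when no extension
    has the same sequence support as the pattern itself.
    """
    base = supports(db, pattern)[0]
    for a in alphabet:
        if supports(db, a + pattern)[0] == base:
            return False
        if supports(db, pattern + a)[0] == base:
            return False
    return True


# Toy DNA-like dataset, invented for illustration only.
db = ["ACGTACGT", "TACGA", "GGACG"]
seq_sup, total_sup, local_sup = supports(db, "ACG")
```

On this toy dataset, `"AC"` is not closed because its right extension `"ACG"` occurs in the same three sequences, whereas `"ACG"` has no extension with equal sequence support. A brute-force check like this rescans the dataset for every extension; it is meant to show the idea, not to be efficient.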
Related articles
Sequential Pattern Mining by Pattern-Growth: Principles and Extensions
Sequential pattern mining is an important data mining problem with broad applications. However, it is also a challenging problem since the mining may have to generate or examine a combinatorially explosive number of intermediate subsequences. Recent studies have developed two major classes of sequential pattern mining methods: (1) a candidate generation-and-test approach, represented by (i) GSP...
Closed Sequential Pattern Mining in High Dimensional Sequences
High dimensional sequences, such as biological sequences, are characterized by a small number of transactions and a large number of items in each transaction. Mining sequential patterns in such sequences needs to consider different forms of patterns, such as contiguous patterns and local patterns that appear more than once in a particular sequence. Mining closed patterns might lead to ...
Mining Sequential Pattern of Multi-dimensional Wind Profiles
Wind has become increasingly important as a source of energy although the generation of wind energy is quite erratic because of its changeable nature. For a given location, wind speed and direction change over time and at different heights. Previous studies have discovered different pattern of wind profiles, however an improved understanding of its spatial, temporal and variation in heights is ...
Spatial modelling of zonality elements based on compositional nature of geochemical data using geostatistical approach: a case study of Baghqloom area, Iran
Due to the existence of a constant sum of constraints, the geochemical data is presented as the compositional data that has a closed number system. A closed number system is a dataset that includes several variables. The summation value of variables is constant, being equal to one. By calculating the correlation coefficient of a closed number system and comparing it with an open number system, ...
An Efficient System Based On Closed Sequential Patterns for Web Recommendations
Sequential pattern mining has received considerable attention from researchers since its introduction and has broad applications. Sequential pattern algorithms generally face problems when mining long sequential patterns or when using a very low support threshold. One possible solution to such problems is to mine closed sequential patterns, which are a condensed representation of seq...
Journal title:
- Int. Arab J. Inf. Technol.
Volume 12, Issue
Pages -
Publication date 2015